Horizontal transfer (HT) of transposable elements has been recognized as a major force driving genomic
variation and biological innovation in eukaryotic organisms. However, the mechanisms of HT in eukaryotes
remain poorly understood. The non-autonomous Helitron family Lep1 is widespread in lepidopteran species,
and the acquired sequences at its 3' end show little interspecific similarity, which makes Lep1 a good
candidate for the study of HT. In this study, we describe Lep1-like elements in multiple non-lepidopteran
species, including two aphids, Acyrthosiphon pisum and Aphis gossypii; two parasitoid wasps, Cotesia
vestalis and Copidosoma floridanum; one beetle, Anoplophora glabripennis; two bracoviruses of parasitoid
wasps; and one intracellular microsporidian parasite, Nosema bombycis. The patchy distribution and high
sequence similarity of Lep1-like elements among distantly related lineages, together with the incongruence
between the Lep1-like element and host phylogenies, suggest the occurrence of HT. Remarkably, the acquired
sequences of both NbLep1 from N. bombycis and CfLep1 from C. floridanum showed over 90% identity with the
Lep1 elements of their lepidopteran hosts. Thus, our study provides evidence of HT facilitated by
host-parasite interactions. Furthermore, in the context of these data, we discuss the putative directions
and vectors of HT of Lep1 Helitrons.



Transposable elements (TEs) are prevalent in the genomes of almost all eukaryotes and are traditionally
categorized by their mode of transposition as class-I elements (retrotransposons) or class-II elements
(DNA transposons). Copy-and-paste retrotransposons replicate via an RNA intermediate, which is reverse
transcribed before reintegration into the genome, whereas DNA transposons move through a single- or
double-stranded DNA intermediate. DNA transposons are divided into three major subclasses: the classic
"cut-and-paste" transposons; rolling-circle (RC) transposons, called Helitrons; and Mavericks, whose
mechanism of transposition is not yet well characterized but which likely replicate using a self-encoded
DNA polymerase.

The inherent mobility and replication abilities of TEs make them particularly prone to horizontal transfer
between organisms, which allows them to escape co-evolved host suppression mechanisms that lead to
inactivation under vertical inheritance. Horizontal transfer (HT) can be defined as the exchange of genetic
material between species by non-vertical inheritance, without the aid of any form of sexual mechanism. Over
200 solid cases of horizontal transfer of TEs (horizontal transposon transfer, or HTT) have been described
so far in multicellular eukaryotes, the majority involving drosophilid flies, and it is believed that TEs
rely heavily on HT for their propagation and maintenance throughout evolution. However, despite mounting
examples of HTT, no specific mechanism that shuttles DNA among eukaryotes has been unequivocally confirmed.

Helitrons, a new superfamily of transposons, were uncovered by computational analysis of the genomic
sequences of Arabidopsis thaliana, Oryza sativa and Caenorhabditis elegans. Unlike traditional class-II
DNA TEs, Helitrons do not produce target site duplications on integration into the host genome and do not
contain terminal repeats, and are therefore difficult to identify. However, Helitrons have conserved
sequence features, including a "TC" motif at the 5' end and a "CTRR" motif at the 3' end, and contain a
palindromic sequence of 16-20 bp near the 3' terminus that can form a hairpin structure. In addition,
Helitrons tend to insert preferentially between host adenine and thymine nucleotides. The non-autonomous
Helitrons, Lep1, were originally identified within introns and untranslated regions from eight lepidopteran
species, and subsequently described as lepidopteran-specific common sequence 3 (LSCS3). A recent study
showed that Lep1 Helitrons are widespread in more than 30 lepidopteran species and estimated that they
occupy 1.3 × 10^[illegible] of the Bombyx mori genome sequence.

Although an increasing number of Lep1 elements are being identified in lepidopteran genomes, little is
known about Lep1 in non-lepidopteran insect species. In this study, we report the presence of Lep1-like
elements in several non-lepidopteran insect species and other distantly related organisms. Our results
suggest that Lep1 Helitrons can undergo horizontal transfer by diverse means.

Results and Discussion 

Evolutionary dynamics of Lep1 in Helicoverpa armigera and its related species. While Lep1 Helitrons have
previously been described in multiple lepidopteran insects, their evolutionary dynamics had not been
investigated further. In this study, a Lep1-like sequence (named HaLep1_1) was identified in H. armigera by
genome walking and subsequent sequence analysis. The HaLep1_1 element is 193 bp long and is located 756 bp
upstream of the translation start codon of the CYP6AE12 gene, in the reverse orientation. A total of 21
full-length sequences with high homology to HaLep1_1 were identified in the non-redundant database and
named HaLep1_2 to HaLep1_22 (Table S1). Figure S1 shows the alignment of these sequences, which present the
typical structural features of Lep1 elements: almost all HaLep1 copies have the characteristic 5'-TC and
3'-CTRY nucleotide termini as well as a CTRR motif at the 3' end of the acquired sequence. Integration
occurs precisely between host A and T nucleotides, without duplication or deletion of the target site,
consistent with the RC mechanism. A phylogeny was constructed from the nucleotide sequences of all these
HaLep1 elements. Neighbor-joining (NJ) analysis revealed three clear major lineages (Fig. S2), designated
Lineage A (HaLep1A), Lineage B (HaLep1B) and Lineage C (HaLep1C), comprising 6, 6 and 9 elements,
respectively. Notably, HaLep1 elements from Lineages A and B showed relatively high identity with the
134 bp Lep1 consensus sequence (83%-89%), whereas HaLep1 elements from Lineage C showed only 68% to 78%
identity with it (Table S1). These results suggest that the HaLep1 lineages may have been transferred
independently into the genome of H. armigera.
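
As an illustration only, and not part of the original analysis, the structural criteria described above (5'-TC and 3'-CTRY termini, insertion between host A and T) could be screened with a short Python sketch. The function names and the toy sequences are hypothetical; R and Y follow the IUPAC convention (R = A or G, Y = C or T).

```python
import re

# IUPAC ambiguity codes used in the motif descriptions: R = A or G, Y = C or T.
THREE_PRIME_END = re.compile(r"CT[AG][CT]$")


def has_lep1_termini(element: str) -> bool:
    """Return True if the element starts with TC and ends with CTRY."""
    seq = element.upper()
    return seq.startswith("TC") and THREE_PRIME_END.search(seq) is not None


def inserted_between_a_and_t(upstream: str, downstream: str) -> bool:
    """Return True if the host base before the element is A and the base after it is T."""
    return upstream[-1:].upper() == "A" and downstream[:1].upper() == "T"


# Toy sequences invented for illustration; they are not real HaLep1 copies.
toy_element = "TCTATACTAATATGGGCTAC"
print(has_lep1_termini(toy_element))
print(inserted_between_a_and_t("ggcta", "tacgg"))
```

A screen of this kind only checks the termini; the CTRR motif at the 3' end of the acquired sequence would still need the internal boundary of the acquired region, which is defined from the alignment.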

The HaLep1_1 sequence was used as a query against the nucleotide (nr/nt) and EST (est_others) collections
to detect sequences with high identity to HaLep1_1 in lepidopteran species other than H. armigera. HaLep1_1
shared the highest similarity with sequences from two species of Heliothinae, Helicoverpa zea and Heliothis
virescens. For example, three sequences from H. zea (accession numbers EF152213, EF152207 and HQ840515) in
the nucleotide (nr/nt) database had over 93% identity with HaLep1_1, and a total of 103 matches were
detected in the H. virescens EST database with an E-value below 1e-[illegible]. Representative examples of
these sequences are shown in Figure S3. Remarkably, the acquired sequence at the 3' end was found only in
H. zea and H. virescens. Further analysis showed that the acquired sequences at the 3' end of all other
HaLep1 elements were likewise conserved only in H. zea and H. virescens (Table S2), suggesting that the
acquired sequence is unique to H. armigera and its closely related species. These results are consistent
with the previous finding that the acquired sequences at the 3' end of Lep1 elements share little
interspecific similarity, with high similarity found only within species or between closely related
species.

To understand whether HaLep1 elements were mobilized recently, insertion polymorphism was assessed
experimentally and by homology searches. PCR and subsequent sequencing of the DNA products showed that,
among 12 individuals sampled, 25% carried the band for the HaLep1_1 insertion (Fig. S4A). Paralogous or
orthologous empty sites were also analyzed using homology searches. No Lep1-like sequence was found at
paralogous sites of HaLep1_20 (accession number: FP340435) in H. armigera or at the orthologous site of
HaLep1_8 in H. zea (accession number: DQ788839) (Fig. S4B, C). H. armigera is a pest widespread across the
Old World from the Western Pacific to the Canary Islands, whereas H. zea is found throughout the warm
regions of the New World and in Hawaii and is thought to be derived from a founder population of
H. armigera approximately 1.5 million years ago. The intraspecific insertion polymorphism of HaLep1_1
suggests a very recent transposition. The insertion polymorphism of HaLep1_8 between two different but
closely related species suggests that HaLep1_8 might have been horizontally transferred into a common
ancestor of H. armigera and H. zea, and that the absence of an orthologous copy in H. zea reflects either
continued transposition of the element after the split of the two species or the differential fixation or
loss of ancestrally polymorphic insertions. Further research is necessary to identify the parent TE of the
non-autonomous HaLep1 elements.

Identification of Lep1-like sequences in non-lepidopteran species.

To characterize the distribution of Lep1-like elements in non-lepidopteran insect species, the Lep1
consensus sequence was used as a query in Blastn searches against insect genome assemblies. No significant
hits were detected in the genomes of the red flour beetle, Tribolium castaneum (Coleoptera: Tenebrionidae);
the blood-sucking bug, Rhodnius prolixus (Hemiptera: Reduviidae); the human body louse, Pediculus humanus
(Phthiraptera: Pediculidae); the honey bee, Apis mellifera (Hymenoptera: Apidae); the parasitoid wasp
Nasonia vitripennis (Hymenoptera: Pteromalidae); or six ants (Hymenoptera: Formicidae), namely Camponotus
floridanus, Linepithema humile, Pogonomyrmex barbatus, Atta cephalotes, Harpegnathos saltator and
Solenopsis invicta. In contrast, our Blastn search detected 138 hits with ≥70% identity to the query over
>100 bp in the pea aphid, Acyrthosiphon pisum (Hemiptera: Aphididae), genome assembly (AphidBase 2.1)
(Table S3). However, because of the presence of many chimaeric elements, neither the acquired sequence
regions nor the proper boundaries of these Lep1-like sequences could be precisely defined by multiple
sequence alignment. Interestingly, one 662 bp EST sequence from the cotton aphid, Aphis gossypii (accession
number: GW506388), also showed high identity with the Lep1 consensus sequence (89%) and with HaLep1_8
(90%).

The Lep1 consensus sequence was further used as a query in Blastn searches against all species with
sequences deposited in the GenBank databases. A total of 278 sequences significantly similar to Lep1 (>70%
identity to the query over >100 bp) were identified in the genome shotgun sequence of Anoplophora
glabripennis (Coleoptera: Cerambycidae). These sequences were subjected to pairwise alignment to reveal
their boundaries and evaluated for the structural features typical of Lep1 Helitrons. Of these, 175
full-length elements were identified and named AglaLep1_1 to AglaLep1_175 (Table S4). The AglaLep1
consensus sequence is 209 bp long and shares 86% similarity with Lep1. It also has the characteristic 5'-TC
and 3'-CTRY nucleotide termini as well as a CTRR motif at the 3' end of its 65 bp acquired sequence.
Comparative analysis showed that the identity between individual AglaLep1 elements and their consensus
sequence ranged from 95% to 100% (excluding indels), with a median of 98%, suggesting recent transposition
activity.
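
The "identity excluding indels" measure reported above can be sketched as follows. This is a minimal illustration under the assumption that each element has already been pairwise aligned to the consensus with "-" marking gaps; the function name and the toy sequences are hypothetical, and the authors' actual calculation may differ in detail.

```python
from statistics import median


def identity_excluding_indels(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # indel column: skipped, so it counts neither for nor against
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned pairs (consensus, copy), invented for illustration.
pairs = [
    ("TCTATACT-GCA", "TCTATAGTAGCA"),
    ("TCTATACTGCA", "TCTATACTGCA"),
    ("TCTATACTGCA", "TCTA-ACTGCA"),
]
identities = [identity_excluding_indels(a, b) for a, b in pairs]
print(identities, median(identities))
```

Skipping gapped columns means that a copy carrying a large deletion can still score 100% over the positions it retains, which is why the paper states the exclusion explicitly.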

Blastn searches using the Lep1 consensus sequence as a query also yielded several significant hits in two
parasitoid wasps, Cotesia vestalis and Copidosoma floridanum, and in one microsporidian parasite, Nosema
bombycis (Table 1). For example, two elements from C. vestalis (CvLep1_1 and CvLep1_2) are 190 bp and
201 bp long, including 62 bp and 65 bp of acquired sequence, and showed 90% and 86% identity with Lep1,
respectively. In C. floridanum, two full-length copies of Lep1-like elements, CfLep1_1 and CfLep1_2, were
identified; they are 253 bp and 236 bp long, including 122 bp and 100 bp of acquired sequence, and showed
75% and 69% identity with Lep1, respectively. Three full-length copies of Lep1-like elements were also
found in N. bombycis (NbLep1_1 to NbLep1_3); they are 445, 208 and 218 bp long, including 314, 76 and 84 bp
of acquired sequence, and showed 93%, 83% and 84% identity with Lep1, respectively.

It is also noteworthy that we identified highly similar sequences in two polydnaviruses (PDVs), which are
symbiotically associated with hymenopteran wasps: three copies in C. vestalis bracovirus (CvBVLep1_1 to
CvBVLep1_3) and four copies each in the Kitale (CsKBVLep1_1 to CsKBVLep1_4) and Mombasa (CsMBVLep1_1 to
CsMBVLep1_4) strains of Cotesia sesamiae bracovirus (Table 1). These elements vary in size from 196 bp
(CsMBVLep1_1) to 344 bp (CsKBVLep1_4). Pairwise comparisons of individual elements revealed high sequence
identity (82%-94%) with the Lep1 consensus sequence (Table 1).

Overall, our BLAST searches detected sequences significantly similar to the Lep1 element in several
non-lepidopteran species. Although cross-species contamination is a concern, our Blastx analysis of the
flanking sequences of representative non-lepidopteran Lep1 elements found no evidence of contamination
(Table S5). The largest numbers of sequences with significant similarity to Lep1 were identified in
A. pisum and A. glabripennis. However, this is probably due to the abundant sequence resources available
for these two species compared with the parasitoid wasps. The low copy number of Lep1-like elements
identified in N. bombycis and the polydnaviruses might be explained by the low likelihood of fixation and
the rapid removal of nonessential DNA in their genomes.

Evidence of horizontal transfer of non-autonomous Lep1 Helitrons.

Traditionally, horizontal transfer has been inferred when highly similar TEs are found in distantly
related taxa, show a discontinuous distribution, and cannot be explained by vertical inheritance. In this
study, a patchy taxonomic distribution of Lep1 was clearly revealed by database searches. While Lep1-like
elements were detected in five non-lepidopteran insect species, including two aphids (A. pisum and
A. gossypii, Hemiptera), one beetle (A. glabripennis, Coleoptera) and two parasitoid wasps (C. vestalis and
C. floridanum, Hymenoptera), no significant hits were observed in the genomes of R. prolixus (Hemiptera),
T. castaneum (Coleoptera), or N. vitripennis, A. mellifera and six ants (Hymenoptera). Remarkably,
Lep1-like elements were also detected in one intracellular microsporidian parasite, N. bombycis, and in two
bracoviruses that are symbiotically associated with hymenopteran parasitic wasps. In many cases, the
sequence identity of the Lep1 Helitrons is exceptionally high relative to the divergence of their hosts.
For example, hymenopteran CvLep1_1 showed 90% identity with the lepidopteran Lep1 consensus sequence,
although the two orders diverged 325 million years ago (http://www.timetree.org/), and CsBVLep1_1 and
NbLep1_1 showed 94% and 93% identity with Lep1, respectively.

To investigate the relationships within Lep1 more closely, we reconstructed phylogenetic trees that focus
on these elements and representative lepidopteran Lep1 elements. The results obtained with the NJ and ML
methods were mostly congruent. We present the topology obtained by the NJ method (Fig. 1); the ML tree is
provided in Figure S5. The tree indicates the existence of two major clades (Fig. 1). The larger clade
comprises Lep1-like sequences from bracoviruses, N. bombycis, C. vestalis, A. glabripennis and A. gossypii,
together with representative Lep1 elements from B. mori (BmLep1_335 and BmLep1_87), Papilio dardanus
(PdLep1_1) and H. armigera (HaLep1A and HaLep1B). Within this clade, two subclades were strongly supported
(100% and 99%): one formed by CsKBVLep1_4, NbLep1_1 and BmLep1_335, and the other by A. gossypii
AgosLep1_1, HaLep1A and HaLep1B. In addition, CvLep1_1, CsMBVLep1_4 and CvBVLep1_2 clustered together with
a bootstrap value of 73%. In the second clade, the Lep1-like sequences from C. floridanum (CfLep1_1 and
CfLep1_2) clustered with Trichoplusia ni TnLep1_1 (FF372817), with a significant bootstrap value of 99%.
These results suggest the occurrence of HT and that multiple mechanisms may underlie the horizontal spread
of Lep1.

While the inherent abilities of TEs to replicate and integrate into 
the host genome undoubtedly facilitate HT between organisms, the 
Table 1

Element       Species                 Accession      Position        Length (bp)  Identity with Lep1 (%)
CvLep1_1      Cotesia vestalis        GAKG01025082   774-963         190          90
CvLep1_2      Cotesia vestalis        GAKG01005573   371-171         201          86
CvBVLep1_1    Cotesia vestalis BV     HQ009543       29755-29864     196          83
CvBVLep1_2    Cotesia vestalis BV     HQ009537       7652-7557       224          83
[missing]     [missing]               [missing]      7743-7648       [missing]    [missing]
CvBVLep1_3    Cotesia vestalis BV     [missing]      7651-7814       218          88
CsKBVLep1_1   Cotesia sesamiae KBV    EF710635       103544-103433   213          94
[missing]     [missing]               EF710634       3978-3860       196          84
[missing]     [missing]               EF710628       [missing]       [missing]    [missing]
[missing]     Cotesia sesamiae KBV    [missing]      [missing]       [missing]    [missing]
[missing]     Cotesia sesamiae MBV    [missing]      [missing]       [missing]    [missing]
[missing]     [missing]               [missing]      [missing]       226          94
CsMBVLep1_4   Cotesia sesamiae MBV    EF710640       4706-4846       227          82
NbLep1_1      Nosema bombycis         AaZ01002694    334-778         445          93
NbLep1_2      Nosema bombycis         AaZ01001444    842-635         208          83
NbLep1_3      Nosema bombycis         AaZ01001453    267-484         218          84
CfLep1_1      Copidosoma floridanum   JI831644       358-106         253          75
CfLep1_2      Copidosoma floridanum   JI839208       363-128         236          69
AgosLep1_1    Aphis gossypii          GW506388       153-362         210          86
Figure 1 | Phylogenetic relationships among Lep1-like elements in non-lepidopteran species and
representative lepidopteran insect species. The neighbor-joining tree was generated in MEGA5 with 1000
bootstrap replicates. Bootstrap values below 50% are not shown. Lep1-like elements in non-lepidopteran
species were derived from database homology searches; their abbreviations and GenBank entries are given in
Table 1. Consensus sequences of HaLep1 lineage A (HaLep1A CS), HaLep1 lineage B (HaLep1B CS), HaLep1
lineage C (HaLep1C CS) and AglaLep1 (AglaLep1 CS) were derived from multiple sequence alignments in this
study. Trichoplusia ni TnLep1_1 was obtained from database homology searches using CfLep1_1 as query, and
its GenBank entry is FF372817. Other representative lepidopteran Lep1 elements were derived from Coates et
al. and were obtained from the following GenBank entries: D86623.1 for Bombyx mori BmLep1_335, DQ242656.1
for B. mori BmLep1_87, CR974474 for Heliconius melpomene HmLep1_1, AC239123 for Bicyclus anynana BaLep1_1,
FP340414 for Spodoptera frugiperda SfLep1_1, EU532470 for Ostrinia nubilalis OnLep1_1, and FM995623 for
Papilio dardanus PdLep1_1. Taxa showing Lep1 are colored taxonomically, with lepidopteran insects in
purple, hymenopteran wasps in green, hemipteran aphids in light blue, the coleopteran beetle in gray,
bracoviruses in red, and Nosema bombycis in orange.



precise mechanisms underlying HTT remain largely mysterious. Several hypotheses have been proposed to
explain how TEs might be transferred between eukaryotic hosts. For example, TEs may exploit
parasite-mediated transfer from one host to another, as in the case of the mariner element transferred
between the braconid parasitoid wasp, Ascogaster reticulatus, and its moth host, the smaller tea tortrix,
Adoxophyes honmai. The low interspecific similarity of the acquired sequences at the 3' end makes Lep1 a
good candidate for the study of HTT mechanisms. In this study, the identification of Lep1 Helitrons in
C. floridanum and N. bombycis as well as in their lepidopteran host insects is of particular interest.
C. floridanum is a polyembryonic encyrtid that parasitizes the egg stage of T. ni and related moth species.
N. bombycis is well known as the causal agent of pebrine, a microsporidian disease of silkworm (B. mori)
larvae. Sequence comparison showed that, across the entire length of the elements, CfLep1_1 showed 94%
identity with TnLep1_1, NbLep1_1 showed 91% identity with BmLep1_335, and NbLep1_2 and NbLep1_3 showed 98%
and 94% identity with BmLep1_87, respectively. Specifically, the acquired sequences of both NbLep1 and
CfLep1_1 showed over 90% identity with those of the Lep1 elements of their lepidopteran hosts (Fig. 2).
Thus, our study provides evidence of HTT facilitated by host-parasite interactions.

Putative directions of horizontal transfer of Lep1 Helitrons. Polydnaviruses display an obligatory
relationship with endoparasitoid wasps belonging to the families Braconidae and Ichneumonidae, and have
been proposed as potential vectors for the delivery of TEs among species. During the past few years, there
have been several reports of TE-like sequences in the genomes of polydnaviruses. In this study, Lep1-like
sequences were identified in C. vestalis bracovirus (CvBV) and in the Kitale (CsKBV) and Mombasa (CsMBV)
strains of C. sesamiae bracovirus. These results suggest that polydnaviruses might be important vectors of
HT of Lep1 Helitrons. Interestingly, Lep1-like sequences were also identified in the parasitoid wasp
C. vestalis. Considering the widespread distribution of Lep1-like sequences in lepidopteran species, it is
reasonable to propose that Lep1 Helitrons were transferred from lepidopteran hosts to parasitoid wasps,
with polydnaviruses mediating the actual transfer of TE DNA between cells. However, the acquired sequences
of CvLep1 and CvBVLep1 showed only moderate similarity (72% between CvLep1_2 and CvBVLep1_3) (Fig. 3). This
is possibly because of the currently limited availability of C. vestalis sequence. C. vestalis is a larval
parasitoid of the diamondback moth, Plutella xylostella (Lepidoptera: Plutellidae). However, we also did
not find sequences similar to the acquired sequences of CvLep1 and CvBVLep1 in the genome database of
P. xylostella (http://iae.fafu.edu.cn/DBM/). Because parasitoids are in some cases likely to oviposit
within marginal (or even completely unsuitable) hosts in the laboratory or field, even if suitable hosts
are present, and because C. vestalis has been reared from several species belonging to different
lepidopteran families, we propose that the CvLep1 elements identified in this study may have been
transferred from another lepidopteran host to C. vestalis. This hypothesis is partly supported by an
observation from this study: the acquired sequence of CsKBVLep1_4 showed 90% similarity with BmLep1_335,
suggesting that C. sesamiae might have oviposited within B. mori (Fig. S6). Alternatively, considering that
the Braconidae wasps form a monophyletic assemblage named the microgastroid complex, which evolved 100
million years ago, and that BVs evolved from the interaction between the common ancestor of microgastroids
and a single ancestral virus, the lepidopteran Lep1 might have repeatedly invaded the common ancestor of
BVs and then been horizontally transferred to Cotesia parasitoids. This hypothesis is supported by the
following observation: the acquired sequences of CvBVLep1_1 and CvBVLep1_[illegible] showed 88% similarity
with CsKBVLep1_2 and CsMBVLep1_1, and with CsKBVLep1_3 and CsMBVLep1_2, respectively (Fig. 3), suggesting
Figure 2 | Alignments of selected sequences from GenBank entries sharing high identity with Nosema
bombycis NbLep1_1 (A), NbLep1_2 and NbLep1_3 (B), and Copidosoma floridanum CfLep1 (C). Nucleotides shaded
in black are conserved across sequences. Typical structural features of the Lep1 elements, including the
characteristic 5'-TC and 3'-CTRY nucleotide termini and the CTRR motif at the 3' end of the acquired
sequence, are boxed. Abbreviations and GenBank entries for these elements are described in Figure 1.
Figure 3 | Alignments of bracovirus and parasitoid wasp Lep1-like elements. Nucleotides shaded in black
are conserved across sequences. Typical structural features of the Lep1 elements, including the
characteristic 5'-TC and 3'-CTRY nucleotide termini and the CTRR motif at the 3' end of the acquired
sequence, are boxed. Abbreviations and GenBank entries for these elements are described in Table 1.




that Lep1-like elements might have inserted into the genome of the common ancestor of these viruses.
Additional experiments and taxon sampling are necessary to further determine the direction and frequency
of HT of Lep1 Helitrons.

Other putative mechanisms underlying horizontal transfer of Lep1 Helitrons. While our results indicate a
role for host-parasite interactions in the HT of Lep1, the presence of Lep1-like elements in
A. glabripennis, A. pisum and A. gossypii is intriguing. Notably, a recent study also showed horizontal
transfer of short interspersed nuclear elements (HaSE2) between heliothine species and A. gossypii. It has
been proposed that mechanisms of HT involve insect-associated facultative symbionts such as the genera
Wolbachia, Rickettsia, Cardinium, Arsenophonus and Sodalis. In addition to the possibility of HT through
facultative symbionts, the Lep1-like elements identified in N. bombycis in this study suggest that
intracellular microsporidian parasites are also potential vectors for HT. Wolbachia is reported to infect
at least 20% of all insect species, including aphids, and N. bombycis can infect various lepidopteran
insects in addition to the domesticated silkworm, indicative of their broad host ranges. Additionally, a
previous study showed that A. glabripennis can be infected by a microsporidian parasite, Nosema
glabripennis. Thus, we propose that facultative symbionts, including Wolbachia, and obligate intracellular
microsporidian parasites might play a role in the HT of Lep1-like elements in A. glabripennis, A. pisum
and A. gossypii. More widespread sequencing will be required to identify the exact vectors that facilitate
the HT of Lep1 Helitrons in these species.

Methods 

DNA extraction and genome walking. A previous study showed that TEs are enriched within or in close
proximity to xenobiotic-metabolizing cytochrome P450 genes in H. zea. To isolate TEs in H. armigera, we
performed genome walking to obtain the 5'-flanking sequence of an insecticide resistance-associated
cytochrome P450 gene, CYP6AE12, in H. armigera. Genomic DNA was isolated from individual third-instar
larvae using the procedure described by Wang et al. Gene-specific primers based on the known cDNA sequence
(accession number: DQ256407) and the four general primers provided with the Genome Walking Kit (TaKaRa,
Dalian, China) were used for each genome walk. PCR products were cloned into the pGEM-T Easy vector
(Promega, Madison, WI, USA) and sequenced.

Database search strategy. Database searches comprised four steps. First, the Lep1-like element (named
HaLep1_1) identified in the 5'-flanking sequence of the H. armigera P450 gene, CYP6AE12, was compared with
the NCBI H. armigera nucleotide collection (nr/nt) database using Blastn (www.ncbi.nlm.gov/cgibin/BLAST);
sequences of high homology, together with 500 bp of upstream and downstream flanking sequence, were
extracted and analyzed for hallmarks of Lep1 Helitrons, such as the characteristic 5'-TC and 3'-CTRY
nucleotide termini and the CTRR motif at the 3' end of the acquired sequence. Second, the nucleotide
(nr/nt) and EST (est_others) collections were searched using HaLep1_1 as query to detect sequences with
high identity to HaLep1_1 in lepidopteran species other than H. armigera. Third, the 134 bp
lepidopteran-specific common sequence 3 (LSCS3, Lep1) was searched against non-lepidopteran insect genome
sequences, including BeetleBase (http://beetlebase.org/), AphidBase (http://www.aphidbase.com/aphidbase/),
NasoniaBase (http://hymenopteragenome.org/nasonia/), BeeBase (http://hymenopteragenome.org/beebase/),
VectorBase (https://www.vectorbase.org) and the Ant Genomes Portal
(http://hymenopteragenome.org/ant_genomes/). Finally, the 134 bp Lep1 consensus sequence was compared with
NCBI non-lepidopteran databases using Blastn, including the whole-genome shotgun, nucleotide collection
(nr/nt), genome survey sequence, high-throughput genomic sequence and expressed sequence tag databases.
Hits that were ≥70% identical to the query over >100 bp were examined and, when possible, full-length
Lep1-like elements were manually extracted. These elements were then used as queries to find additional
related Lep1 Helitrons; the resulting hits were examined, and full-length elements were extracted.
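
The hit-selection rule in the final step (≥70% identity over >100 bp) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the Blastn results were saved in the standard 12-column tabular format (BLAST+ `-outfmt 6`); the paper does not state how the output was stored, and the function name and example rows are hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(tabular_text, min_identity=70.0, min_length=100):
    """Keep hits with identity >= min_identity (%) and alignment length > min_length (bp)."""
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        if float(hit["pident"]) >= min_identity and int(hit["length"]) > min_length:
            hits.append(hit)
    return hits


# Invented rows for illustration; these are not results from the study.
example = "\n".join([
    "Lep1\tcontigA\t86.50\t130\t17\t1\t1\t130\t500\t629\t1e-30\t150",
    "Lep1\tcontigB\t95.00\t60\t3\t0\t1\t60\t10\t69\t1e-10\t80",
    "Lep1\tcontigC\t69.90\t134\t40\t0\t1\t134\t1\t134\t1e-05\t60",
])
for hit in filter_hits(example):
    print(hit["sseqid"], hit["pident"], hit["length"])
```

Hits passing the filter would still need the manual inspection described above to define full-length element boundaries.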

Assessing polymorphism. In H. armigera, HaLep1_1 insertion polymorphism was assessed by a PCR survey using
one pair of primers flanking the insertion site, which yielded products of different sizes in individuals
with the HaLep1_1 insertion (about 700 bp) and without it (about 500 bp). To further illustrate the
mobility of other HaLep1 elements, insertion polymorphisms were also assessed by homology searches.
Briefly, paralogous or orthologous sites not containing a HaLep1 insertion (empty sites) were identified by
Blastn homology searches with a query constructed from the sequences directly flanking the insertion site.
This chimeric query sequence (about 200 bp long) was created by joining the flanking sequence immediately
upstream of the element insertion (about 100 bp) to the flanking sequence immediately downstream of it
(about 100 bp).
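
The construction of the chimeric empty-site query can be sketched as below. This is an illustrative reconstruction rather than the authors' script: the function name, the 0-based half-open coordinate convention and the toy contig are assumptions.

```python
def build_empty_site_query(contig: str, start: int, end: int, flank: int = 100) -> str:
    """Join the flanks on either side of an element to mimic the site without the insertion.

    start and end are 0-based, half-open coordinates of the element within contig.
    """
    if not (0 <= start < end <= len(contig)):
        raise ValueError("element coordinates must lie within the contig")
    upstream = contig[max(0, start - flank):start]   # up to `flank` bp before the element
    downstream = contig[end:end + flank]             # up to `flank` bp after the element
    return upstream + downstream


# Toy contig invented for illustration: 150 bp of host sequence, a 9 bp mock element,
# then another 150 bp of host sequence.
contig = "A" * 150 + "TCGGGCTAC" + "T" * 150
query = build_empty_site_query(contig, 150, 159)
print(len(query))
```

A Blastn hit that spans the junction of the two flanks without interruption would correspond to an empty site, whereas a hit split by an intervening Lep1-like sequence would indicate an occupied one.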

Sequence analysis. Multiple sequence alignments were performed using ClustalW with default settings.
Neighbor-joining (NJ) and maximum likelihood (ML, using the Tamura-Nei model) phylogenetic trees were
constructed using MEGA5. The reliability of the NJ and ML tree topologies was evaluated by bootstrap
analysis with 1000 replicates. To detect putative cross-species contamination during DNA sequencing, 10 kb
of sequence in each direction (upstream and downstream) of each representative non-lepidopteran Lep1
insertion was extracted from the BAC clone sequences and used to search the non-redundant databases with
Blastx on the NCBI server (www.ncbi.nlm.gov/cgibin/BLAST).
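
For readers unfamiliar with bootstrap analysis, the resampling step that underlies the 1000 replicates can be sketched in Python. MEGA5 performs this internally; the sketch below is not MEGA5's implementation, shows only how alignment columns are resampled with replacement, and omits tree building for each replicate. The function name and toy alignment are hypothetical.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping the rows (taxa) intact."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    sampled = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in sampled) for name, seq in alignment.items()}


# Toy alignment invented for illustration.
toy = {"seqA": "TCTATACT", "seqB": "TCTATAGT", "seqC": "TCAATACT"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_replicate(toy, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

The bootstrap value of a clade is then the percentage of replicate trees in which that clade is recovered.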